Code Generation for Efficient Query Processing in Managed Runtimes
Authors
Abstract
In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also lets them use the same query language on an application’s in-memory collections. The latter offers developers further transparency, as both the query language and all data are represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying, and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language, as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.
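The contrast the abstract draws between default evaluation and query compilation can be illustrated with a minimal sketch. It is written in Python rather than C#, and it is not the paper's implementation: the `Order` type, the `where`/`select` helpers, and `compile_sum_query` are hypothetical names introduced only for illustration. The first query chains generic iterator operators, loosely analogous to how LINQ queries over in-memory collections are evaluated; the second generates source code for a single fused loop and compiles it at runtime.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical application object; not taken from the paper.
    customer: str
    amount: float


def where(rows, pred):
    # Generic filter operator: one callback invocation per element.
    for r in rows:
        if pred(r):
            yield r


def select(rows, proj):
    # Generic projection operator: another callback per element.
    for r in rows:
        yield proj(r)


def interpreted_query(orders, threshold):
    # Operator-at-a-time pipeline built from chained iterators.
    filtered = where(orders, lambda o: o.amount > threshold)
    return sum(select(filtered, lambda o: o.amount))


def compile_sum_query(predicate_src, projection_src):
    # Code generation: emit source for one fused loop with the predicate
    # and projection inlined, then compile it into a callable.
    # Assumption: the expression strings come from a trusted query
    # front end, never from end users.
    src = (
        "def q(rows, threshold):\n"
        "    total = 0.0\n"
        "    for o in rows:\n"
        f"        if {predicate_src}:\n"
        f"            total += {projection_src}\n"
        "    return total\n"
    )
    namespace = {}
    exec(compile(src, "<generated query>", "exec"), namespace)
    return namespace["q"]
```

Both forms compute the same result; the generated function simply avoids the per-element iterator and callback indirection. Whether and by how much this helps depends on the runtime, so the sketch makes no performance claim of its own.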
Similar resources
Efficient query processing in managed runtimes
This thesis presents strategies to improve the query evaluation performance over huge volumes of relational-like data that is stored in the memory space of managed applications. Storing and processing application data in the memory space of managed applications is motivated by the convergence of two recent trends in data management. First, dropping DRAM prices have led to memory capacities that...
Dynamic Taint Tracking in Managed Runtimes
This paper provides a taxonomy of runtime taint tracking approaches for managed code, such as code written in Java, C#, PHP, Perl, or Ruby. It covers main applications of data tainting such as preventing web application vulnerabilities including cross-site scripting and SQL injection attacks, along with disallowing privacy-sensitive data leaks. In addition to giving an overview of related litera...
Declarative Query Processing in Imperative Managed Runtimes
The falling price of main memory has led to the development and growth of in-memory databases. At the same time, new advances in memory technology, like persistent memory, make it possible to have a truly universal storage model, accessed directly through the programming language in the context of a fully managed runtime. This environment is further enhanced by language-integrated query, which ...
Compiling Database Queries into Machine Code
On modern servers the working set of database management systems becomes more and more main memory resident. Slow disk accesses are largely avoided, and thus the in-memory processing speed of databases becomes an important factor. One very attractive approach for fast query processing is just-in-time compilation of incoming queries. By producing machine code at runtime we avoid the overhead of t...
Speculative Execution of Parallel Programs with Precise Exception Semantics on GPUs
General purpose computing on GPUs (GPGPU) can enable significant performance and energy improvements for certain classes of applications. However, current GPGPU programming models, such as CUDA and OpenCL, are only accessible by systems experts through low-level C/C++ APIs. In contrast, large numbers of programmers use high-level languages, such as Java, due to their productivity advantages of ty...
Journal title:
- PVLDB
Volume 7, Issue
Pages -
Publication date: 2014